Fast Rates for Empirical Risk Minimization of Strict Saddle Problems
Abstract
We derive bounds on the sample complexity of empirical risk minimization (ERM) in the context of minimizing non-convex risks that admit the strict saddle property. Recent progress in non-convex optimization has yielded efficient algorithms for minimizing such functions. Our results imply that these efficient algorithms are statistically stable and also generalize well. In particular, we derive fast rates which resemble the bounds that are often attained in the strongly convex setting. We specialize our bounds to Principal Component Analysis and Independent Component Analysis. Our results and techniques may pave the way for statistical analyses of additional strict saddle problems.
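As a hedged illustration of the setting (the parameters α, β, γ, δ below are generic strict saddle constants, not necessarily the paper's exact quantities), a twice-differentiable risk R is strict saddle if every point x satisfies at least one of

\|\nabla R(x)\| \ge \alpha, \qquad \lambda_{\min}\big(\nabla^2 R(x)\big) \le -\beta, \qquad x \text{ is } \delta\text{-close to a local minimizer around which } R \text{ is } \gamma\text{-strongly convex}.

In this context a fast rate means an excess-risk bound of the order

R(\hat{x}_n) - \min_x R(x) = \tilde{O}(1/n) \quad \text{rather than the generic} \quad O(1/\sqrt{n}),

resembling the bounds attained under strong convexity.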
Similar works
Ranking and Scoring Using Empirical Risk Minimization
A general model is proposed for studying ranking problems. We investigate learning methods based on empirical minimization of the natural estimates of the ranking risk. The empirical estimates take the form of a U-statistic. Inequalities from the theory of U-statistics and U-processes are used to obtain performance bounds for the empirical risk minimizers. Convex risk minimization methods a...
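For concreteness, a hedged sketch of the standard formulation (the notation here is generic, not necessarily the paper's): given i.i.d. pairs (X_i, Y_i) and a ranking rule r, the natural empirical estimate of the ranking risk is the U-statistic

\hat{L}_n(r) = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} \mathbf{1}\big\{ r(X_i, X_j)\, \mathrm{sgn}(Y_i - Y_j) < 0 \big\},

whose concentration around the true ranking risk is controlled via the theory of U-statistics and U-processes.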
Exploiting Strong Convexity from Data with Primal-Dual First-Order Algorithms
We consider empirical risk minimization of linear predictors with convex loss functions. Such problems can be reformulated as convex-concave saddle point problems, and thus are well suited to primal-dual first-order algorithms. However, primal-dual algorithms often require explicit strongly convex regularization in order to obtain fast linear convergence, and the required dual proximal mappi...
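A hedged sketch of the reformulation mentioned above (generic notation, assuming per-example convex losses φ_i with convex conjugates φ_i^* and a regularizer g): the regularized ERM problem for linear predictors

\min_{w} \; \frac{1}{n}\sum_{i=1}^{n} \phi_i(a_i^\top w) + g(w)

can be rewritten as the convex-concave saddle point problem

\min_{w} \max_{\alpha \in \mathbb{R}^n} \; \frac{1}{n}\sum_{i=1}^{n} \big( \alpha_i\, a_i^\top w - \phi_i^*(\alpha_i) \big) + g(w).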
Ranking - convex risk minimization
The problem of ranking (rank regression) has become popular in the machine learning community. This theory relates to problems in which one has to predict (guess) the order between objects on the basis of vectors describing their observed features. In many ranking algorithms a convex loss function is used instead of the 0-1 loss. This makes these procedures computationally efficient. Hence, conv...
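As a hedged illustration of the convexification step (the particular surrogates are common choices, not taken from the paper): the pairwise 0-1 loss \mathbf{1}\{t \le 0\} is replaced by a convex surrogate such as the hinge loss \phi(t) = \max(0, 1 - t) or the logistic loss \phi(t) = \log(1 + e^{-t}), and one minimizes the resulting convex empirical ranking risk

\hat{A}_n(f) = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} \phi\big( \mathrm{sgn}(Y_i - Y_j)\,(f(X_i) - f(X_j)) \big)

over a class of scoring functions f.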
Adaptive Stochastic Primal-Dual Coordinate Descent for Separable Saddle Point Problems
We consider a generic convex-concave saddle point problem with a separable structure, a form that covers a wide range of machine learning applications. Under this problem structure, we follow the framework of primal-dual updates for saddle point problems, and incorporate stochastic block coordinate descent with adaptive stepsizes into this framework. We theoretically show that our proposal of ada...
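A hedged sketch of the separable structure in question (generic notation; the paper's exact formulation may differ): with the primal variable split into blocks x = (x_1, \dots, x_K), the problem takes the form

\min_{x} \max_{y} \; \sum_{k=1}^{K} g_k(x_k) + y^\top \sum_{k=1}^{K} A_k x_k - \phi^*(y),

and a stochastic primal-dual coordinate method samples a block k at random, updates x_k by a proximal step with an adaptively chosen stepsize, and then updates the dual variable y by a proximal ascent step.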
Fast Learning Rates for Plug-in Classifiers by Jean-Yves Audibert
It has been recently shown that, under the margin (or low noise) assumption, there exist classifiers attaining fast rates of convergence of the excess Bayes risk, that is, rates faster than n^{-1/2}. The work on this subject has suggested the following two conjectures: (i) the best achievable fast rate is of the order n^{-1}, and (ii) the plug-in classifiers generally converge more slowly than the cl...
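For context, a hedged statement of the margin (low noise) assumption referenced above, with η(x) = P(Y = 1 | X = x) and generic constants C, α:

P\big( 0 < |\eta(X) - \tfrac{1}{2}| \le t \big) \le C\, t^{\alpha} \quad \text{for all } t > 0.

Under such a condition the excess Bayes risk of suitable classifiers can decay faster than the generic n^{-1/2} rate.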